AITopics

2502.07555

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia > Hanover County (0.04)
(16 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJun-19-2024

R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation

Ye, Fuda, Li, Shuangyin, Zhang, Yongqi, Chen, Lei

Retrieval augmented generation (RAG) has been applied in many scenarios to augment large language models (LLMs) with external documents provided by retrievers. However, a semantic gap exists between LLMs and retrievers due to differences in their training objectives and architectures. This misalignment forces LLMs to passively accept the documents provided by the retrievers, leading to incomprehension in the generation process, where the LLMs are burdened with the task of distinguishing these documents using their inherent knowledge. This paper proposes R$^2$AG, a novel enhanced RAG framework to fill this gap by incorporating Retrieval information into Retrieval Augmented Generation. Specifically, R$^2$AG utilizes the nuanced features from the retrievers and employs a R$^2$-Former to capture retrieval information. Then, a retrieval-aware prompting strategy is designed to integrate retrieval information into LLMs' generation. Notably, R$^2$AG suits low-source scenarios where LLMs and retrievers are frozen. Extensive experiments across five datasets validate the effectiveness, robustness, and efficiency of R$^2$AG. Our analysis reveals that retrieval information serves as an anchor to aid LLMs in the generation process, thereby filling the semantic gap.

llm, retrieval information, similarity, (15 more...)

2406.13249

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine (1.00)
Energy (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJun-16-2024

Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision

Jiang, Fan, Drummond, Tom, Cohn, Trevor

Cross-lingual open domain question answering (CLQA) is a complex problem, comprising cross-lingual retrieval from a multilingual knowledge base, followed by answer generation in the query language. Both steps are usually tackled by separate models, requiring substantial annotated datasets, and typically auxiliary resources, like machine translation systems to bridge between languages. In this paper, we show that CLQA can be addressed using a single encoder-decoder model. To effectively train this model, we propose a self-supervised method based on exploiting the cross-lingual link structure within Wikipedia. We demonstrate how linked Wikipedia pages can be used to synthesise supervisory signals for cross-lingual retrieval, through a form of cloze query, and generate more natural questions to supervise answer generation. Together, we show our approach, \texttt{CLASS}, outperforms comparable methods on both supervised and zero-shot language adaptation settings, including those using machine translation.

question word, question word and answer, transformed question, (13 more...)

2402.16508

Country:

North America > United States > Washington > King County > Seattle (0.14)
Oceania > Australia > Victoria > Melbourne (0.14)
Europe > Russia (0.04)
(21 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kumar-Singh, Niraj, Polleti, Gustavo, Paliwal, Saee, Hodos-Nkhereanye, Rachel

LinkLogic: A New Method and Benchmark for Explainable Knowledge Graph Predictions

arXiv.org Artificial IntelligenceJun-2-2024

While there are a plethora of methods for link prediction in knowledge graphs, state-of-the-art approaches are often black box, obfuscating model reasoning and thereby limiting the ability of users to make informed decisions about model predictions. Recently, methods have emerged to generate prediction explanations for Knowledge Graph Embedding models, a widely-used class of methods for link prediction. The question then becomes, how well do these explanation systems work? To date this has generally been addressed anecdotally, or through time-consuming user research. In this work, we present an in-depth exploration of a simple link prediction explanation method we call LinkLogic, that surfaces and ranks explanatory information used for the prediction. Importantly, we construct the first-ever link prediction explanation benchmark, based on family structures present in the FB13 dataset. We demonstrate the use of this benchmark as a rich evaluation sandbox, probing LinkLogic quantitatively and qualitatively to assess the fidelity, selectivity and relevance of the generated explanations. We hope our work paves the way for more holistic and empirical assessment of knowledge graph prediction explanation methods in the future.

explanation, linklogic, relation, (12 more...)

2406.00855

Country:

Europe > Hungary (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(29 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Neural Information Processing SystemsMar-14-2024, 02:13:51 GMT

Topic-Partitioned Multinetwork Embeddings

We introduce a new Bayesian admixture model intended for exploratory analysis of communication networks--specifically, the discovery and visualization of topic-specific subnetworks in email data sets. Our model produces principled visualizations of email networks, i.e., visualizations that have precise mathematical interpretations in terms of our model and its relationship to the observed data. We validate our modeling assumptions by demonstrating that our model achieves better link prediction performance than three state-of-the-art network models and exhibits topic coherence comparable to that of latent Dirichlet allocation. We showcase our model's ability to discover and visualize topic-specific communication patterns using a new email data set: the New Hanover County email network. We provide an extensive analysis of these communication patterns, leading us to recommend our model for any exploratory analysis of email networks or other similarly-structured communication data. Finally, we advocate for principled visualization as a primary objective in the development of new network models.

communication pattern, email, email network, (16 more...)

Country:

North America > United States > Virginia > Hanover County (0.25)
North America > United States > North Carolina > New Hanover County (0.25)
Asia > Middle East > Jordan (0.05)
(2 more...)

Industry:

Government (0.68)
Information Technology (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)

Meng, Shuochuan, Soleimani-Babakamali, Mohammad Hesam, Taciroglu, Ertugrul

Automatic Roof Type Classification Through Machine Learning for Regional Wind Risk Assessment

arXiv.org Artificial IntelligenceMay-26-2023

Roof type is one of the most critical building characteristics for wind vulnerability modeling. It is also the most frequently missing building feature from publicly available databases. An automatic roof classification framework is developed herein to generate high-resolution roof-type data using machine learning. A Convolutional Neural Network (CNN) was trained to classify roof types using building-level satellite images. The model achieved an F1 score of 0.96 on predicting roof types for 1,000 test buildings. The CNN model was then used to predict roof types for 161,772 single-family houses in New Hanover County, NC, and Miami-Dade County, FL. The distribution of roof type in city and census tract scales was presented. A high variance was observed in the dominant roof type among census tracts. To improve the completeness of the roof-type data, imputation algorithms were developed to populate missing roof data due to low-quality images, using critical building attributes and neighborhood-level roof characteristics.

artificial intelligence, machine learning, roof type, (18 more...)

2305.17315

Country:

North America > United States > Florida > Dade County (0.34)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Florida > Miami-Dade County (0.27)
(6 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (0.94)
Banking & Finance > Real Estate (0.68)
Information Technology > Security & Privacy (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsApr-6-2023, 12:11:13 GMT

Topic-Partitioned Multinetwork Embeddings

We introduce a joint model of network content and context designed for exploratory analysis of email networks via visualization of topic-specific communication patterns. Our model is an admixture model for text and network attributes which uses multinomial distributions over words as mixture components for explaining text and latent Euclidean positions of actors as mixture components for explaining network attributes. We demonstrate the capability of our model for descriptive, explanatory, and exploratory analysis by investigating the inferred topic-specific communication patterns of a new government email dataset, the New Hanover County email corpus.

exploratory analysis, topic-partitioned multinetwork embedding, topic-specific communication pattern, (1 more...)

Country:

North America > United States > Virginia > Hanover County (0.31)
North America > United States > North Carolina > New Hanover County (0.31)

Technology: Information Technology > Artificial Intelligence (0.34)

Krafft, Peter, Moore, Juston, Desmarais, Bruce, Wallach, Hanna M.

Topic-Partitioned Multinetwork Embeddings

Neural Information Processing SystemsFeb-15-2020, 00:13:22 GMT

exploratory analysis, topic-partitioned multinetwork embedding, topic-specific communication pattern, (1 more...)

Country:

North America > United States > Virginia > Hanover County (0.31)
North America > United States > North Carolina > New Hanover County (0.31)

Technology: Information Technology > Artificial Intelligence (0.61)

Krafft, Peter, Moore, Juston, Desmarais, Bruce, Wallach, Hanna M.

Topic-Partitioned Multinetwork Embeddings

Neural Information Processing SystemsDec-31-2012

We introduce a new Bayesian admixture model intended for exploratory analysis ofcommunication networks--specifically, the discovery and visualization of topic-specific subnetworks in email data sets. Our model produces principled visualizations ofemail networks, i.e., visualizations that have precise mathematical interpretations in terms of our model and its relationship to the observed data. We validate our modeling assumptions by demonstrating that our model achieves better link prediction performance than three state-of-the-art network models and exhibits topic coherence comparable to that of latent Dirichlet allocation. We showcase our model's ability to discover and visualize topic-specific communication patternsusing a new email data set: the New Hanover County email network. We provide an extensive analysis of these communication patterns, leading us to recommend our model for any exploratory analysis of email networks or other similarly-structured communication data. Finally, we advocate for principled visualization asa primary objective in the development of new network models.

artificial intelligence, email network, natural language, (18 more...)

Country:

North America > United States > Massachusetts (0.28)
North America > United States > Virginia > Hanover County (0.25)
North America > United States > North Carolina > New Hanover County (0.25)

Industry:

Information Technology (0.68)
Government (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)